Abstract

Background: Insertion sequence (IS) elements are known to play an important role in the evolution and genomic
diversification of Escherichia coli O157:H7 lineages. In particular, IS629 has been found in multiple copies in the
E. coli O157:H7 genome and is one of the most prevalent IS elements in this serotype. It was recently shown that the lack of
O157 antigen expression in two O rough E. coli O157:H7 strains was due to IS629 insertions at two different locations
in the gne gene, which is essential for O antigen biosynthesis.

Results: Comparison of four E. coli O157:H7 genome and plasmid sequences showed numerous IS629 insertion
sites, although these were not uniformly distributed among strains. Comparison of the IS629 elements found in O157:H7 and O55:H7
showed the presence of at least three different IS629 sub-types. O157:H7 strains carry IS629 elements of sub-types I
and III, whereas the ancestral O55:H7 carries sub-type II. Analysis of strains selected from various clonal groups
defined in the E. coli O157:H7 stepwise evolution model showed that IS629 was not observed in sorbitol-fermenting
O157 (SFO157) clones, which are on a divergent pathway in the emergence of O157:H7. This suggests
that the absence of IS629 in SFO157 strains probably arose during the divergence of this lineage, although it
remains uncertain whether it contributed, in part, to their divergence from other closely related strains.

Conclusions: The highly variable genomic locations of IS629 in O157:H7 strains of the A6 clonal complex indicate
that this insertion element probably played an important role in genome plasticity and in the divergence of O157:H7
lineages.



Background 

Enterohemorrhagic Escherichia coli (EHEC) of serotype
O157:H7 has been implicated in foodborne illnesses
worldwide. It frequently causes large outbreaks of severe
enteric infections, including bloody diarrhoea, hemorrhagic
colitis (HC) and haemolytic uremic syndrome (HUS)
[1,2]. This serotype constitutively expresses the somatic
(O) 157 and flagellar (H) 7 antigens; these traits
are therefore used extensively in clinical settings to identify this
highly pathogenic serotype [1]. However, some O157:H7
strains, although genotypically O157 or H7, do not
express either of those antigens [3,4]. According to the
latest CDC report, E. coli O157:H7 infections affect
thousands of people every year, accounting for 0.7%, 4%
and 1.5% of illnesses, hospitalizations and deaths,
respectively, of the total U.S. foodborne diseases caused
by all known foodborne pathogens [5].

Previously, we characterized two potentially pathogenic
O rough:H7 strains that did not express the O157
antigen [4,6] but belonged to the most common O157:H7
clonal type. The O rough phenotype was found to
be due to two independent IS629 insertions in the gne
gene, which encodes an epimerase essential for
synthesis of an oligosaccharide subunit of the O antigen.
Of the IS elements identified in O157 strains, IS629
elements are the most prevalent in this serotype and
have been confirmed to transpose very actively in O157
genomes [7]. The presence of O rough strains of this
serotype in food and clinical samples is of concern, as
they cannot be detected serologically in assays routinely
used to test for O157:H7 [3].
The occurrence of other atypical O157:H7 strains due
to IS629 insertions might therefore be more common
than anticipated. It is generally assumed that IS elements
play important roles in bacterial genome evolution
and in some cases are known contributors to
adaptation and improved fitness [7]. The acquisition or
loss of mobile genetic elements, such as IS elements, may
differ between strains of a particular bacterial species
[8]. IS insertions and IS-mediated deletions have been
shown to generate phenotypic diversity among closely
related O157 strains [7]. It has been shown that O157 is
a highly diverse group and that prophages are a major factor
affecting this diversity [7]. However, in addition to
prophages, IS629 also appears to be a major contributor
to genomic diversification of O157 strains. It therefore
remains an open question how much influence IS629 had on the
evolution of O157:H7 and how important IS629
is to changes in virulence in this bacterium.

An evolutionary model proposed previously holds that highly
pathogenic enterohemorrhagic E. coli
(EHEC) O157:H7 arose from its ancestor, enteropathogenic
E. coli (EPEC) O55:H7 (SOR+ and GUD+),
through sequential acquisition of virulence, phenotypic
traits, and serotypic change (A1 (stx-) / A2 (stx2) in
Figure 1A) [9-11]. After the somatic antigen change
from O55 to O157 gave rise to an intermediary (A3),
which has not yet been isolated, two separate O157
clonal complexes evolved, splitting into two diverged
clonal groups. One of these groups was composed of
sorbitol-fermenting (SF) non-motile O157:NM strains
containing plasmid pSFO157 (A4) (SOR+, GUD+). The
other was composed of non-sorbitol-fermenting (NSF)
O157:H7 strains containing plasmid pO157 (A5) (SOR-,
GUD+). The latter, through mutational inactivation of the
uidA gene, lost its β-glucuronidase activity, giving rise to the
most typical O157:H7 phenotype at present (A6) [11].
These A6 strains have spread geographically into disparate
locales and now account for most of the diseases
caused by EHEC [12].

Figure 1 Stepwise evolutionary model for E. coli O157:H7 from
ancestral O55:H7 [11]. In red letters are the possible events
and where they occurred during the stepwise evolution.
The circle in gray represents an intermediary A3 CC, which has not
yet been isolated. SOR - sorbitol fermentation [(+) fermenting,
(-) non-fermenting or slow fermenting]. GUD - β-D-glucuronidase
activity.

IS629 seems to play an important role in the diversification
of closely related strains, specifically O157:H7 [7].
In the present study, we examined a panel of E. coli strains,
including ancestral and atypical strains associated with the
stepwise emergence of E. coli O157:H7, to determine the prevalence of
IS629 and its impact on the transitional steps that gave
rise to today's highly pathogenic E. coli O157:H7.

Results 

IS629 prevalence in E. coli O157:H7 genomes

The IS629 sequence, recently found to be inserted into
the gne gene in E. coli O rough:H7 (MA6 and CB7326)
[4,13], was used for a BLAST analysis of the genomes
of four E. coli O157:H7 strains belonging to A6 CC
(EDL933, Sakai, EC4115 and TW14359) and one O55:H7
strain (CB9615) (Additional file 1, Table S1). The
BLAST analysis showed the presence of
between 22 and 25 IS629 copies in each O157:H7 strain, counting the
chromosome together with its corresponding plasmid (Table 1). Strains Sakai
and EDL933 shared 13 of those IS629 on the chromosome
and three on their pO157 plasmids. Strains
EC4115 and TW14359 had 17 IS629 on the chromosome
and four on their pO157 plasmids in common.
The analysis of the recently released genome of E. coli O55:H7
strain CB9615 [14] allowed identification
of one IS629 with an internal 86 bp deletion on the
chromosome and one IS629 on its corresponding pO55
plasmid. Neither the O55 genomic (located on the
chromosome backbone) nor the pO55 plasmid IS629
insertion site was present in the O157:H7 strains.
The absence of the pO55 IS629 insertion site in O157:H7
strains was expected, since they do not carry the
pO55 plasmid. However, the lack of the genomic O55
IS629 insertion site in O157:H7 strains is interesting, as
these strains are known to be closely related [14]. Contrary
to what was observed for plasmids pO157 and
pO55, IS629 was absent in plasmid pSFO157 (E. coli
O157:H- strain 439-89). However, a 66 bp sequence
identical to IS629 was observed in the plasmid, which
could be a remnant of IS629. No genomic sequence is
available for an O157:H- strain at this time; thus, this
strain could not be investigated for the presence of
IS629.

Table 1 Prevalence of IS629 elements in each strain (chromosomes and plasmids) and number of shared IS629

                                           In common with strain
Strain          Serotype   IS629 sites     Sakai         EDL933         EC4115         TW14359         CB9615
Chromosomes
Sakai           O157:H7    19                            13             9              9               0
EDL933          O157:H7    21              13                           6              6               0
EC4115          O157:H7    19              9             6                             17              0
TW14359         O157:H7    21              9             6              17                             0
CB9615          O55:H7     1               0             0              0              0

Plasmids                                   pO157 Sakai   pO157 EDL933   pO157 EC4115   pO157 TW14359   pO55 CB9615
pO157 Sakai     O157:H7    3                             3              3              3               0
pO157 EDL933    O157:H7    3               3                            3              3               0
pO157 EC4115    O157:H7    4               3             3                             4               0
pO157 TW14359   O157:H7    4               3             3              4                              0
pSFO157         O157:H-    0               0             0              0              0               0
pO55 CB9615     O55:H7     1 tr*           0             0              0              0

tr* - truncated.
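
The shared-site counts reported in Table 1 amount to pairwise set intersections. The following is a minimal sketch of that bookkeeping, assuming each strain's IS629 insertion sites are already available as a set of site identifiers. The strain names and site labels are invented placeholders, not the sites identified in this study.

```python
from itertools import combinations


def shared_site_counts(sites_by_strain):
    """Return {(strain_a, strain_b): number of insertion sites in common}."""
    counts = {}
    for a, b in combinations(sorted(sites_by_strain), 2):
        counts[(a, b)] = len(sites_by_strain[a] & sites_by_strain[b])
    return counts


# Hypothetical toy data: site labels are illustrative only.
toy_sites = {
    "strainA": {"s1", "s2", "s3", "s4"},
    "strainB": {"s1", "s2", "s5"},
    "strainC": {"s2", "s6"},
}

pairwise = shared_site_counts(toy_sites)

# Sites present in every strain are candidates for insertions that predate
# the divergence of the strains being compared.
common_to_all = set.intersection(*toy_sites.values())
```

Representing each genome as a set keeps the comparison independent of the order in which sites were discovered or numbered.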

IS629 target site specificity ("hot spots") on chromosomes
and plasmids of four E. coli O157:H7 strains

The majority of IS629 elements (62%) were located on
prophages or prophage-like elements ("strain-specific
loops", S-loops in Sakai [15]); 28% of IS629
locations were found on the well-conserved 4.1-Mb
sequence widely regarded as the E. coli chromosome
backbone (E. coli K-12 orthologous segment) [15], and
10% were located on the pO157 plasmid. In total, we
observed 47 different IS629 insertion sites (containing
complete or partial IS629) in the four E. coli chromosomes
and plasmids by in silico analysis (Additional
file 2, Table S2). Seven of the 47 IS629 insertions were
shared among the four diverged strains, which suggests that
they were also present in a common ancestor.
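
One way to arrive at a breakdown by location such as the one above is to compare each insertion coordinate against annotated prophage and prophage-like intervals and treat everything else as backbone. The sketch below illustrates this under stated assumptions: the region list, the coordinates, and the helper names `classify_site` and `location_shares` are hypothetical and do not reproduce the annotation used in this study.

```python
from collections import Counter


def classify_site(position, regions):
    """Return the label of the first region containing position, else 'backbone'."""
    for label, start, end in regions:
        if start <= position <= end:
            return label
    return "backbone"


def location_shares(positions, regions):
    """Percentage of insertion sites falling into each location category."""
    tally = Counter(classify_site(p, regions) for p in positions)
    total = sum(tally.values())
    return {label: 100.0 * n / total for label, n in tally.items()}


# Hypothetical coordinates, for illustration only.
toy_regions = [("prophage", 1000, 2000), ("prophage-like", 5000, 6000)]
toy_positions = [1500, 1800, 5500, 7000]
shares = location_shares(toy_positions, toy_regions)
```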

IS629 presence in strains belonging to the stepwise
model of emergence of E. coli O157:H7

A total of 27 E. coli strains (Table 2) belonging to the
stepwise model proposed by Feng et al. (1998) were
examined by PCR for the presence of IS629 using specific
primers [16]. Every strain of clonal complexes (CC)
A6, A5, A2 and A1 carried IS629, except strain 3256-97,
which belongs to the ancestral CC A2 (Figure 1). Strikingly,
however, IS629 was absent in
the SFO157 strains belonging to the closely related CC
A4 (Figure 2). Whole genome analysis of two A4 strains
(493-89, accession no. AETY00000000, and H2687, accession
no. AETZ00000000) confirmed the absence of this
specific IS element in SFO157 strains [17]. On the other
hand, O55:H7 strain 3256-97 (AEUA00000000) carried
a truncated IS629 version missing the target area for the
reverse primer (IS629-insideR) located in ORFB,
explaining the lack of IS629 detection by PCR [17]. Additionally,
strains USDA5905 (A2) and TB182A (A1) as well as
strain LSU-61 (A?) appear to harbor a truncated IS629,
which could indicate the presence of the genomic IS629
found in the O55 strain CB9615. However, since no
additional ancestral strains were available for analysis,
the distribution of IS629 in these groups is at present
inconclusive.
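
The PCR-negative result for a strain carrying a truncated element can be illustrated with a simple in silico PCR check: a product is only expected when both primer binding sites are present. This is a deliberately simplified sketch (exact matches only, one strand, one product); the primer and template sequences are invented and are not the IS629 primers used here.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def amplicon_length(template, forward, reverse):
    """Length of the expected product for a primer pair, or None.

    The forward primer must match the template strand; the reverse primer
    binds the opposite strand, so its reverse complement must appear in the
    template downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    rc = reverse_complement(reverse)
    end = template.find(rc, start + len(forward))
    if end == -1:
        return None
    return end + len(rc) - start


# Invented sequences, for illustration only.
FWD = "ATGGCC"
REV = "TTAGGC"  # anneals where the template reads GCCTAA
intact = "CC" + "ATGGCC" + "AAAATTTT" + "GCCTAA" + "GG"
truncated = "CC" + "ATGGCC" + "AAAATTTT"  # reverse-primer site deleted
```

With these toy inputs the intact template yields a product, whereas the truncated template, which lacks the reverse-primer site, yields none even though part of the element is still present.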

IS629 distribution in strains belonging to the stepwise
model of emergence of E. coli O157:H7

We successfully PCR amplified 38 of the 47 observed
IS629 insertion sites in the 27 strains analyzed
(Additional file 3, Table S2). We determined the presence
or absence of an IS629 element as well as of the
IS629 target site in each strain (Additional file 1,
Figure S1). In accordance with the previous finding of
total absence of IS629 in SFO157, none of the A4 CC
strains harbored an IS629 in any of the known IS629
insertion sites. The same was observed for A1 and
A2 CC strains, indicating that the previously detected
IS629 must be located in some other region of the
chromosome. In A5 CC strains, only 3 of the 38 (8%)
IS629 insertion sites harbored an IS629 (Table 3).
Those sites were located on the prophage Sp12, the
prophage-like element SpLE1, and on the chromosomal
backbone. Interestingly, one of the A5 CC strains
(strain 1659) did not share any of the known sites
harboring IS629. The A6 CC strains shared between
6 (16%) and 21 (55%) IS629 insertions in the known
sites, and two of them (IS.15: Sp14 and IS.41: pO157)
were present in all A6 CC strains. IS629 prevalence in
the A6 strains and the distribution amongst Sp, SpLE,
backbone and the pO157 plasmids did not show any
Table 2 Serotype, sequence type, characteristics and isolation information of strains of E. coli used in this study

                                                   Special characteristics
No.  Name       Other name    Serotype  stx    GUD  SOR  plasmid   ST   CC  Source    Year  Reference
1    Sakai      BAA 460       O157:H7   1, 2             pO157     66   A6  Japan     1996  NC_002695
2    EDL 933    700927        O157:H7   1, 2             pO157     66   A6  USA       1982  AE005174
3    EC 4115                  O157:H7   1, 2             pO157     66   A6  USA       2006  NC_011353
4    TW 14359                 O157:H7   1, 2             pO157     66   A6  USA       2006  CP001368
5    EDL 931    35150         O157:H7   1, 2             pO157     66   A6                  [26]
6    MA6                      O157:H7   2                pO157     66   A6  Malaysia  1998  [6]
7    550654                   O157:H7   2                pO157     66   A6  USA       2009
8    FDA 413                  O157:H7   2                pO157     66   A6                  [10]
9    G5101                    O157:H7   1, 2   +         pO157     65   A5  USA       1995  [11]
10   1628                     O157:H7   1, 2   +         pO157     65   A5                  [32]
11   1659                     O157:H7   1, 2   +         pO157     65   A5                  [11]
12   EC 97144   TW 10707      O157:H7   1, 2   +    +    pO157     65   A5  Japan     1997  [33]
13   EC 96038   TW 10201      O157:H7   1, 2   +    +    pO157     65   A5                  [11]
14   EC 96012   TW 10189      O157:H7   1, 2   +    +    pO157     65   A5                  [11]
15   493-89                   O157:H-   2      +    +    pSFO157   75   A4  Germany   1989  [11]
16   5412-89                  O157:H-   2      +    +    pSFO157   75   A4  Germany   1989  [34]
17   H56929     TW 09159      O157:H-   2      +    +    pSFO157   76   A4  Finland   1999  [11]
18   H56909     TW 09162      O157:H-   2      +    +    pSFO157   76   A4  Finland   1999  [11]
19   H 1085c                  O157:H-   2      +    +    pSFO157   76   A4  Scotland  2003  [11]
20   H 2687                   O157:H-   2      +    +    pSFO157   76   A4  Scotland  2003  [11]
21   3256-97    TW 07815      O55:H7    2      +    +    ?         73   A2  USA       1997  [11]
22   USDA 5905                O55:H7    2      +    +    ?         73   A2  USA       1994  [26]
23   TB 182A    TW 04062      O55:H7           +    +    ?         73   A1  USA       1991  [11]
24   DEC5A                    O55:H7           +    +    ?         73   A1                  [11]
25   LSU-61                   O157:H7          +    +    ?         237  ?   USA       2001
26   Sakai      PF            O157:H7   1, 2             pO157     66   A6  Japan     1996  NC_002695
27   43895      CDC EDL 933   O157:H7   1, 2             pO157     69   A6  USA       1982  AE005174

stx - Shiga toxin gene; GUD - β-glucuronidase activity; SOR - sorbitol fermentation; ST - sequence type as determined by a combination of seven genes
(http://www.shigatox.net/stec/cgi-bin/index); CC - clonal complex [11]; ? - unknown. Sakai PF and 43895 are strains derived after numerous subculture passages from the
original Sakai and EDL933 strains, respectively.



Figure 2 Gel electrophoresis of the PCR products for IS629
presence in strains belonging to the stepwise model of
emergence of E. coli O157:H7. Lanes: Ld, molecular weight ladder
(Gene Ruler); wt, blank; 1 - 25, strains numbered according to Table
2. A1-A6, clonal complexes; A?, CC unknown. Lane order in the figure:
Ld, wt, 1-21, 23, 24, 22, 25.



specific pattern; however, IS629 appears to
transpose actively in the A6 CC.
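
The occupancy percentages quoted above are the number of sites scored positive divided by the 38 sites that could be amplified. A minimal sketch of that calculation follows, assuming one strain's calls are held as a list of '+', '-' and 'ND' entries as in Table 3; the function name `occupancy` and the example profiles are illustrative only.

```python
def occupancy(profile):
    """Summarise one strain's calls across all insertion sites.

    profile: list of '+', '-' or 'ND' (not determined) entries.
    Returns (occupied sites, scored sites, rounded percentage occupied).
    """
    scored = [call for call in profile if call != "ND"]
    occupied = sum(1 for call in scored if call == "+")
    return occupied, len(scored), round(100.0 * occupied / len(scored))


# Illustrative profiles: 47 sites in total, 9 of which were not determined.
low_profile = ["+"] * 3 + ["-"] * 35 + ["ND"] * 9
high_profile = ["+"] * 21 + ["-"] * 17 + ["ND"] * 9
```

Excluding the 'ND' entries from the denominator keeps failed amplifications from being counted as absences.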

Figure 1B shows a maximum parsimony tree obtained
for A5 and A6 CC strains using IS629 presence/absence
in the target site and presence/absence of the IS629 target
site (chromosome or plasmid region) (Table 3 and
Additional file 4, Table S3). Strains belonging to the A1, A2,
and A4 CCs were not included in this analysis because
they either lack IS629 (A4) or carry IS629 in
regions of the chromosome other than the ones determined
for O157:H7 strains. The parsimony tree separated
strains belonging to A5 from A6 strains, as proposed
in the stepwise model (Figure 1 and 3A) [10,12].
Furthermore, it showed the existence of high diversity
among A5 and A6 CC strains, similar to what has been
shown by PFGE [11]. The validity of this analysis needs



Table 3 IS629 element presence/absence in CC strains from the O157:H7 stepwise evolutionary model

CC of strains (numbered according to Table 2): A6, 1-8; A5, 9-14; A4, 15-20; A2, 21-22; A1, 23-24; A?, 25; A6, 26-27

NR  Phage or backbone  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27



15.1 Sp4 ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND 

15.2 Sp4 ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND 
S. 3 Sp 5 stx2 + - - - ________ _____ ________ + 

S. 4 SpLE 1 - + - - -- + -_--_ _____ __________ 

S. 5 SpLE 1 + + ++ + - + -- -__ _____ _____ __ + + 

S. 6 SpLE 1 - + - - -- + -_-__ _____ __________ 

S. 7 SpLE 1 + + ++ + - + + -- -- - -- -- ........ + + 

S. 8 Sp 8 + - + - + - + - - + - 

S. 9 Sp 8 - + - - 



+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


ND 


ND 


ND 


ND 



+ 



S. 10 back + + -- + - + + -- -- - -- -- ________ + + 

S. 1 1 back + + -- + -- -_-__ _____ ________ + + 



S. 12 Sp 12 + -- - _______ + + + ___ ________ + _ 

S. 13 back + + ++ + + + -- -- - . . . . . ________ + + 

S. 14 Sp 13 + + + + + + - + ---- ----- _____ _ _ + + 



S. 15 Sp 14 + + + + + + + + ---- _____ ________ + 

IS.16 SpLE 2 ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND 
S. 17 back + + -- - -_-_-__ _____ ________ + + 

S. 18 Sp 15 stxl + + -- + _______ _____ ________ + + 

S. 19 back + -+ + + __ + ____ _____ ________ + _ 

S. 20 Sp 17 + - + + +- + + ---- _____ ________ + 

S. 21 SpLE3 + + -+ + + + + -- -- - -- -- ________ + + 

IS.22 back ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND 
S. 23 SpLE 5+ + -- + - + + -- -- _____ ________ + + 

+ 
+ 



S. 24 SpLE 1 + - - 

S. 25 SpLE 1 - + - - ----_--_ _____ ________ 

IS.26 9330 ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND 






IS.27 SpLE 2 - + - - - - - - - - - - - - - - - - - - - - - - - - +
IS.28 933Y ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND
IS.29 Sp 1 - - + + - - - + - - - - - - - - - - - - - - - - - - -
IS.30 Sp 4 - - + - - - - - - - - - - - - - - - - - - - - - - - -
IS.31 Phage - - + + - - - - + + - + + + - - - - - - - - - - - - -
IS.32 back - - + + - - - + - - - - - - - - - - - - - - - - - - -
IS.33 Sp 13 - - + + - - - - - - - - - - - - - - - - - - - - - - -
IS.34 back - - + + - - - - - - - - - - - - - - - - - - - - - - -
IS.35 Sp 5 stx2 - - + + - - - + - - - - - - - - - - - - - - - - - - -
IS.36 back ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND


S. 37 


Phage 






+ 


+ 












































+ 




S. 38 


back 






+ 


+ 
















































S. 39 


{gne gene) 












+ 












































S. 40 


p0157 


+ 








+ 










































+ 




S. 41 


p0157 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




































+ 


+ 


S. 42 


p0157 






+ 


+ 












































+ 


+ 


IS.43 


p0157 
























































S. 44 


p0157 






+ 


+ 
















































S. 45 


p0157 








+ 
















































S. 46 


back 








+ 










+ 


+ 




































IS.47 back ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND
IS.48 pO157 ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND ND



IS629 sites were numbered from 1 - 47 (NR), starting with all sites in Sakai, followed by all additional, unshared sites from EDL933 and EC4115, the sites found in the plasmids, and unshared sites of strain TW14359. The
newly found IS629 insertion in O rough:H7 strain MA6 was numbered IS.39 [4]. A1 - A6 are strains belonging to the different clonal complexes.
Sp - Phage; SpLE - Phage-like element; back - backbone; ND - Not determined, primers failed to amplify the region.
Figure 3 Evolutionary significance of IS629 in the emergence of E. coli O157:H7. A) Maximum parsimony tree obtained using the
distribution of IS629 and IS629 target sites in the 14 O157:H7 strains analyzed in the present study (Table 3 and Additional file 4, Table S3). B)
Maximum parsimony tree obtained using IS629 target sites for the 27 strains analyzed in the present study (Additional file 4, Table S3). The
colored ellipses mark the different CCs. CC - clonal complex; ST - sequence type.



to be explored further using more O157:H7 strains
belonging to either the A5 or A6 CC. Besides using 25 different
strains for the analysis, we also included additional
Sakai and EDL933 strains. The Sakai strains were one
from ATCC (BAA-460) and the other from a personal
collection (FDA). The EDL933 strains were provided by
ATCC, whereby strain EDL933 700927 derived from
EDL933 43895. PFGE analysis showed only minimal
changes between the original (ATCC) and the derived
strains, confirming their identity (data not shown). The
analysis using the IS629 distribution likewise showed only
minimal changes among the
Sakai and EDL933 strains. IS629 presence/absence
in specific regions has been used before to help
detect outbreak-related strains, as described by Ooka
et al. (2009), and for population genetics analysis, as
described by Yokoyama et al. (2011), and appears to be
a promising and adequate technique to distinguish closely
related O157:H7 strains. However, neither methodology
takes into account whether the region into which
IS629 inserted is itself present.
The presence/absence of a specific region in E. coli
O157:H7 chromosomes, irrespective of the presence of
IS629, could provide additional information regarding
relatedness among those strains.
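
To make the idea of combining IS629 presence with target-site presence concrete, the sketch below scores a three-state character matrix on a fixed tree with Fitch's small-parsimony algorithm. It is an illustration under stated assumptions, not the software or settings used for Figure 3: the state coding (0 = target site absent, 1 = site present without IS629, 2 = site present with IS629), the strain names and the toy trees are all hypothetical.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree: nested 2-tuples with leaf names (str) at the tips.
    states: dict mapping leaf name -> observed state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        changes += 1  # no shared state: one change is needed on this branch pair
        return left | right

    visit(tree)
    return changes


def tree_length(tree, matrix):
    """Sum of Fitch scores over all characters; matrix maps leaf -> list of states."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_score(tree, {leaf: row[i] for leaf, row in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical calls for four invented strains at three insertion sites.
toy_matrix = {
    "s1": [2, 1, 0],
    "s2": [2, 1, 0],
    "s3": [1, 2, 1],
    "s4": [1, 2, 1],
}
grouped = (("s1", "s2"), ("s3", "s4"))
mixed = (("s1", "s3"), ("s2", "s4"))
```

A maximum parsimony search would prefer the tree with the smaller length; here `grouped` needs fewer changes than `mixed`. Coding an absent target site as its own state is what lets the region itself, and not only the IS629 call, contribute to the score.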

IS629 insertion site prevalence in the strains belonging to
the stepwise model of emergence of E. coli O157:H7

PCR analysis for the presence of IS629 insertion sites
showed that sites located on the chromosomal backbone
structure were present in all tested strains from the different
clonal complexes (Table 4 and Additional file 4).
However, neither A1, A2, nor A4 CC strains harbored
any IS629 in backbone IS629 insertion sites.

Contrary to what was observed in the well-conserved
backbone, IS629 insertion sites in prophages and prophage-like
elements in different strains were found to be
highly variable (Table 5 and Additional file 4, Table S3).
As seen for the backbone IS629 insertion sites, some of
the phage-associated IS629 insertion sites were present
in A1, A2 and A4 CC strains; however, they lacked
IS629. Many of the IS629 sites on phages were unique
to the A6 CC strains (7 of 13), suggesting that they are
strain-specific. This result underscores significant differences
in the presence of phage-related sequences
between the strains belonging to the stepwise model of
E. coli O157:H7.
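
The count of phage-associated sites unique to the A6 CC can be reproduced from the "+" entries of Table 5 with a short filter. In this sketch the dictionary is transcribed from the table as read here and should be treated as illustrative; the variable names are not from the original study.

```python
# Clonal complexes scored "+" for each phage-associated target site (Table 5).
present_in = {
    "Sp 1": {"A6"},
    "Sp 2": {"A1", "A2", "A4", "A5", "A6"},
    "Sp 4": {"A1", "A2", "A4", "A5", "A6"},
    "Sp 5": {"A6"},
    "Sp 8": {"A6"},
    "Sp 12": {"A2", "A4", "A5", "A6"},
    "Sp 13": {"A6"},
    "Sp 14": {"A4", "A5", "A6"},
    "Sp 17": {"A6"},
    "SpLE 1": {"A5", "A6"},
    "SpLE 2": {"A6"},
    "SpLE 3": {"A6"},
    "SpLE 5": {"A5", "A6"},
}

# Sites whose only positive clonal complex is A6.
a6_only = sorted(site for site, ccs in present_in.items() if ccs == {"A6"})
```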

The two IS629 insertions in O55 and its corresponding
plasmid pO55 were observed to be present in only
one ancestral A2 and both A1 CC strains (data not
shown). A6, A5, and A4 CC strains as well as A2 CC
strain 3256-97 (IS629-deficient) lacked the IS629 insertion
site in these regions. Interestingly, strain LSU-61,



Table 4 Presence of IS629 target sites on the backbone

IS629 target sites   A1    A2    A3    A4    A5    A6
IS.10                +/-   +     NA    +     +     +/-
IS.11                +     +     NA    +     +     +
IS.13                +     +     NA    +     +     +
IS.17                +     +     NA    +     +     +
IS.19                +     +     NA    +     +     +
IS.32                +     +     NA    +     +     +
IS.34                +     +     NA    +     +     +
IS.38                +     +     NA    +     +     +
IS.39                +     +     NA    +     +     +
IS.46                -     -     NA    +/-   +     +

NA, not applicable; + presence; - absence; +/- present in some strains.



Table 5 Presence of phage or phage-like associated IS629
target sites

IS629 target sites   A1    A2    A3    A4    A5    A6
Sp 1                 -     -     NA    -     -     +
Sp 2                 +     +     NA    +     +     +
Sp 4                 +     +     NA    +     +     +
Sp 5                 -     -     NA    -     -     +
Sp 8                 -     -     NA    -     -     +
Sp 12                -     +     NA    +     +     +
Sp 13                -     -     NA    -     -     +
Sp 14                -     -     NA    +     +     +
Sp 17                -     -     NA    -     -     +
SpLE 1               -     -     NA    -     +     +
SpLE 2               -     -     NA    -     -     +
SpLE 3               -     -     NA    -     -     +
SpLE 5               -     -     NA    -     +     +

Sp - Phage; SpLE - Phage-like element; NA - not applicable; + presence; -
absence.



which carries multiple characteristics of O157:H7 and
is thought to be ancestral to A5 CC strains (Feng et al.
2007), appeared to carry the truncated genomic IS629
insertion.

Since the strains belonging to the stepwise model
share variable IS629 insertion sites, we reconstructed
their evolutionary path using this information. A parsimony
analysis using IS629 target site presence/absence
produced a tree that was nearly analogous to the proposed
model of stepwise evolution of O157:H7 from
ancestral O55:H7 strains [10], with A1/A2 CC strains at
the base of the tree, followed by A4 CC, A5 CC and A6
CC strains in that order (Figure 3B).

Phylogenetic analysis of IS629 elements in the four E. coli
O157:H7 and O55:H7 genomes

The phylogenetic analysis of IS629 elements revealed
that IS629 in E. coli O157:H7 and O55:H7 can be divided into three
different sub-types (Figure 4). IS629 of sub-types
I and II differ on average by 4% (> 55 bp), while sub-types II
and III differ by 5% (> 60 bp). Sub-type I appears to
be most closely related to IS1203 (an IS629 isoform)
found in O111:H- [18]. IS629 sub-type II appears
to be most closely related to IS629 found in
Shigella [19]. IS629 sub-type III appears to be most closely
related to IS629 found in E. coli O26:H11
[20]. Analysis of all targeted IS629 elements thus
showed that strains from A6 CC seem to carry both
IS1203 (sub-type I) and IS629 (sub-type III), whereas the
ancestral O55:H7 strain carries IS629 (sub-type II).
Since IS629 sub-type II found in the ancestral O55:H7
strain is significantly different from the other two IS629
sub-types (O157:H7 strains) and sub-type II is no longer
present in certain O157:H7 strains (A6 CC), these data
imply that IS629 sub-types I and III were recently



Figure 4 Phylogenetic tree of IS629 in E. coli O157:H7 and O55:H7 showing the three different IS629 sub-types present on those five
genomes. IS629 sub-type I differed from sub-type II by 4% (> 55 bp) and sub-type II differed from sub-type III by 5% (> 60 bp). IS629 sub-type II
was only present in the O55:H7 genome (A1/A2 CC), while sub-types I and III were present in all O157:H7 genomes (A6 CC). The evolutionary history
was inferred using the Minimum Evolution method [31]. The tree is drawn to scale, with branch lengths in the same units as those of the
evolutionary distances used to infer the phylogenetic tree. Bootstrap support when above 50% is shown at nodes. Sp - prophages; SpLE -
prophage-like elements; and back - backbone.
acquired by E. coli O157:H7 strains after the separation
from the sub-lineage leading to the A4 CC strains, which
therefore do not carry IS629.
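
Differences between sub-types expressed as a percentage and a number of base pairs correspond to an uncorrected pairwise distance between aligned sequences. The sketch below computes that simple measure; it is an assumption that this matches how the figures above were obtained, since the tree itself was inferred with the Minimum Evolution method, and the example sequences are invented.

```python
def p_distance(seq_a, seq_b):
    """Differing sites and percent difference between two aligned sequences.

    Alignment columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing, 100.0 * differing / compared


# Invented toy alignment: one mismatch in ten compared positions.
toy_diff = p_distance("ACGTACGTAC", "ACGTTCGTAC")
```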

Discussion 

IS elements are in general regarded as genetic factors
that significantly contribute to genomic diversification
and evolution [7]. Ooka et al. (2009) determined
that the IS elements IS629 and ISEc8, found in the
O157:H7 lineage, serve as an important driving force
behind its genomic diversity. However, only a few genome-wide
studies have been conducted to compare IS
distributions in closely related genomes. In our study we
determined that IS629 insertions in E. coli O157:H7 are
widely distributed across the genome and differ significantly
from strain to strain. Although the ancestral O55:H7
strain carried only two IS629, one on the chromosome
and one on the pO55 plasmid, the four O157:H7
genomes carried between 22 and 25 IS629 copies on
the chromosome and the corresponding pO157 plasmid.

IS629 does not seem to integrate specifically at
sequence-based target sites, which explains the highly
diverged flanking sites found in the genomes we examined.
Sequence-specific insertion is exhibited to some
degree by several elements and varies considerably in
stringency [21]. Other elements exhibit regional preferences,
which are less obvious to determine [21]. IS elements
frequently generate a short target site duplication
(TSD) flanking the IS upon insertion [21]; this feature
was also observed for IS629 in the four O157:H7 strains.
IS629 duplicated 3 to 4 base pairs at the insertion
site; such TSDs, with identical base pairs up- and
downstream of IS629, were observed for 21 of the 47 IS629 insertion
sites. A comparison of the 21 TSDs created
by IS629 in the four strains analyzed here did not
reveal as many similarities as observed previously by
Ooka et al. (2009). The comparison of 25 bp up- and
downstream of each insertion site did not show any
similarities or patterns that would have suggested a
target preference or "hot spot" for IS629 insertions.
Hence, insertion site specificity for IS629 remains
unknown. However, IS629 is frequently surrounded by
other IS elements ("IS islands") and was found
inserted at different sites in the same gene (gne) [4,13].
Although no specific "hot spot" for IS629 insertions was
observed, it seems highly possible that mobile elements
such as plasmids, phages or phage-like elements could have
functioned as vectors for IS629 introduction into O157:H7
genomes. These observations suggest that insertion
might occur preferentially in certain regions of the chromosome,
although these events may not be sequence
specific.
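
Checking an insertion for a target site duplication comes down to comparing the few bases immediately upstream of the element with those immediately downstream. A minimal sketch follows, assuming the element boundaries are already known; the function name `find_tsd`, the coordinates and the sequences are invented for illustration, and only exact direct repeats of 3 or 4 bp are considered.

```python
def find_tsd(genome, is_start, is_end, sizes=(4, 3)):
    """Return the direct repeat flanking an element, or None.

    genome: sequence string; the element occupies genome[is_start:is_end].
    sizes: candidate duplication lengths, tried longest first.
    """
    for k in sizes:
        if is_start - k < 0 or is_end + k > len(genome):
            continue
        upstream = genome[is_start - k:is_start]
        downstream = genome[is_end:is_end + k]
        if upstream == downstream:
            return upstream
    return None


# Invented example: lowercase stands in for the element, flanked by a 3 bp repeat.
element = "ggaagtgtcgcc"
with_tsd = "TTACGTCA" + element + "TCAGGCAT"
without_tsd = "TTACGTCA" + element + "GGGGCAT"
start = len("TTACGTCA")
tsd = find_tsd(with_tsd, start, start + len(element))
```

An insertion site without matching flanks, as in `without_tsd`, returns `None`; this could reflect a truncated element or a later rearrangement, and the check alone cannot distinguish between the two.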

IS629 insertion sites located on the backbone seem to
be conserved in almost all of the strains studied here,
whereas sites located on phages and phage-like elements
appear to differ between strains. These findings
affirm the presence of regions of genomic stability and
regions of genomic variability within O157:H7
populations and closely related strains. It is noteworthy
that sites associated with phages seem to be present predominantly
in closely related strains. The majority of the
phages present in the A6 CC strains appear to be
unique to this complex. Since bacteriophages are known
to contribute to the diversification of bacteria [22], they
seem to be a major determinant in generating diversity
among O55:H7, O157:H- and O157:H7 strains. The
comparison of IS629 prevalence in A5 and A6 CC as
well as IS629 insertion site prevalence in all strains
allowed strains from different complexes to be distinguished,
as proposed in the evolution model for
O157:H7 (Figure 1A) [11]. Including the "same" strains
(Sakai and EDL933) from different collections allowed
confirmation of the stability of IS629 sites. Minimal
changes in IS629 presence/absence were observed and
could have occurred due to different storage conditions
and passages. Despite these subtle changes, these strains
grouped tightly together on the parsimony tree. Therefore,
this analysis can be used to further distinguish closely
related O157:H7 strains. These findings are in
agreement with a recently described IS629 analysis in
three O157 lineages [23]. Similarly to what was determined
for A6 and A5 CC strains, Yokoyama et al. (2011)
determined that IS629 distribution was biased in different
O157 lineages, indicating the potential effectiveness
of IS-printing for population genetics analysis of O157.
Furthermore, Ooka et al. (2009) found that IS-printing
can resolve about the same degree of diversity as
PFGE. Since A1, A2 and A4 CC strains did not share
IS629 insertions, their population genetics analysis how- 
ever, remains limited to closely related 0157:H7 strains. 

Comparison of IS629s found in O157:H7 and O55 pointed out
extensive divergence between these elements. At least three
different IS629 types could be distinguished, differing by
55 to 60 bp. The O157:H7 strains carry IS629 elements of
subtype I and III, whereas O55:H7 carries type II only. It
is notable that only four nucleotide differences were
observed among seven housekeeping genes comprising a current
MLST scheme (http://www.shigatox.net/ecmlst/cgi-bin/dcs)
between A1 CC strain DEC5A and A6 CC strain Sakai. These two
strains, in particular, are taken to represent the most
ancestral and most derived E. coli, respectively, in the
stepwise evolutionary model for this pathogen. If the IS629
type I and III observed in A6 CC strains resulted from
divergent evolution of IS629 type II, the amount of change
observed among these IS types should be similar to that
observed for the MLST loci examined above. However, the
number of nucleotide substitutions separating IS629 type I
and III in O157:H7 from type II in O55:H7 was 10-fold
higher. Thus, the differences between IS629 types are larger
than those observed for housekeeping genes. This indicates
that IS629 type II was most likely lost and IS629 type I and
III were acquired independently in distinct E. coli O157:H7
lineages. Further supporting this thesis was the fact that
one of the IS629 type II copies was found on the pO55
plasmid, which was subsequently lost during evolution
towards O157:H7 strains. The other IS629 copy in O55, with a
unique internal deletion, is located in the chromosome and
appears to be part of a mobile region [24] which is absent
in O157:H7 strains.

Interestingly, the ancestral IS629-deficient A2 O55:H7
strain 3256-97 is also lacking both IS629-associated regions
found in the O55:H7 strains. Our analysis of common IS629
target sites suggested that strain 3256-97 is more closely
related to A4 and A5 CC strains than to other A1 and A2
strains. Therefore, it is likely that IS629 has been lost in
strain 3256-97 as well as in the hypothetical A3 precursor.
These results may indicate that strain 3256-97 or a similar
strain lacking IS629 might have given rise to
IS629-deficient A4 CC strains.

E. coli O157:H7 strains carry multiple IS629 copies while
the non-pathogenic K-12 strain lacks IS629 but carries other
IS elements. Other pathogenic E. coli strains among the top
six non-O157 STEC, namely O26:H11, O111:H- and O103:H2 [25],
also harbor various copies of IS629 elements in their
genomes. Genome sequences for the other three most important
pathogenic non-O157 STEC (O45, O145, and O121) are not
available to date; thus the presence of IS629 elements in
these serogroups is unknown. Interestingly, they also share
with O157:H7 the same reservoir (e.g. cattle), Shiga toxins,
the haemolysin gene cluster, other virulence factors and
several phages and phage-like elements [25]. Ooka et al.
(2009) postulated that IS-related genomic rearrangements may
have significantly altered virulence and other phenotypes in
O157 strains. These findings suggest that IS629 might not
only have a great impact on their genomic evolution but
might increase the pathogenicity of those strains as well.

Conclusions 

The genomic sequence analysis showed that IS629 insertion
sites exhibited a highly biased distribution. IS629 was much
more frequently located on phages or prophage-like elements
than in the well-conserved backbone structure, which is
consistent with the observations by Ooka et al. (2009).
IS629 was found to be present in the A1 and one of two A2 CC
strains examined as well as in all the O157:H7 strains of A5
and A6 CC; however, it was totally absent in the 6 examined
SFO157 strains of A4 CC. The A4 CC strains are related to,
but on a divergent evolutionary pathway from, O157:H7. These
results suggest that the absence of IS629 in A4 strains
probably occurred during the divergence, but it is uncertain
if it contributed to the divergence. Overall, IS629 had a
great impact on the genomic diversification of the E. coli
O157:H7 lineage and might have contributed to the emergence
of the highly pathogenic O157:H7.

Methods 

Bacterial strains 

The bacterial strains used in this study are listed in
Table 2 and were chosen to represent typical EHEC and EPEC
strains from the different clonal complexes of the evolution
model for E. coli O157:H7 [11], with different serotypes
(O157:H7, O157:H- and O55:H7) and different characteristics
(e.g. β-glucuronidase activity (GUD) and sorbitol
fermentation (SOR)).

"In silico" analysis 

Various E. coli O157:H7 and non-O157 chromosomes and pO157
plasmids (Additional file 2, Table S1) deposited at the
National Center for Biotechnology Information (NCBI)
database were queried for IS629 (accession number X51586)
presence and insertion loci using BLAST analysis.
Furthermore, approximately 400 bp of the up- and downstream
flanking regions of each newly localized IS629 in the
chromosome and the plasmids were compared with each other.
We investigated whether an IS629 was also present in the
other strains or appeared exclusively in either the
chromosome or the plasmids.
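
As a rough illustration of the flank extraction step, the sketch below
pulls the sequence on either side of an IS629 hit from a replicon held
as a string. The function name and the coordinate convention (0-based,
half-open) are assumptions made for this example, and the replicon is
treated as linear, so a hit close to the sequence ends of a circular
chromosome or plasmid would yield a shortened flank.

```python
def flanking_regions(genome, is_start, is_end, flank=400):
    """Return (upstream, downstream) sequence around an IS element.

    genome:   replicon sequence as a string (treated as linear here)
    is_start: 0-based index of the first base of the IS element
    is_end:   0-based index one past the last base of the IS element
    flank:    number of bases to take on each side (about 400 bp in the text)
    """
    if not 0 <= is_start <= is_end <= len(genome):
        raise ValueError("IS coordinates fall outside the sequence")
    upstream = genome[max(0, is_start - flank):is_start]
    downstream = genome[is_end:is_end + flank]
    return upstream, downstream
```

The two returned strings could then be compared between strains, for
example by BLAST, to decide whether the same site exists elsewhere.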

Nucleic acid extraction and determination of IS629 presence

DNA used as the template for PCR was prepared from overnight
cultures grown in Luria-Bertani Broth (LB) and purified
using the MASTER PURE™ DNA Purification kit (EpiCentre,
Madison, WI). For determining IS629 presence in the E. coli
strains, we conducted a "touchdown" multiplex PCR using
IS629-specific primers targeting conserved regions of the
insertion element previously described by Ooka et al.
(2009): IS629-insideF (5'-GAACGTCAGCGTCTGAAAGAGC-3') and
IS629-insideR (5'-GTACTCCCTGTTGATGCCAG-3') and specific 16S
rDNA primers: SRM86 (5'-AGAAGCACCGGCTAACTC-3') [7] and SRM87
(5'-CGCATTTCACCGCTACAC-3') [26]. The latter were used as
internal amplification control. PCR amplifications were
performed using 0.5 ng of template DNA in a final volume of
30 µl. The PCR reaction mixture contained 2.5 U of HotStart
Taq Polymerase (Qiagen, Valencia, CA), 1X Taq polymerase
buffer, 2.0-3.5 mM MgCl2, 400 µM each deoxynucleoside
triphosphate (dNTP), 300 nM each IS629 primer pair, and
300 nM each 16S rDNA primer pair. The "touchdown" PCR [27]
conditions were: 1 cycle of 95°C for 15 min; 10 cycles of
95°C for 30 s, 69-59°C (-1°C/cycle) for 15 s and 72°C for
1.5 min; followed by 35 cycles consisting of 95°C for 30 s,
58°C for 20 s, and 72°C for 1.5 min; and a final extension
at 72°C for 4 min. Amplicons were visualized on a 1% agarose
gel in Tris-Borate EDTA (TBE) buffer containing 0.3 µg/ml
ethidium bromide.
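
The touchdown phase of the program can be written out per cycle with a
small helper. This is a sketch under stated assumptions: the function
name is hypothetical, and it assumes that the first touchdown cycle
anneals at 69°C and that the temperature drops by 1°C before each
subsequent cycle. The text does not say whether the stated 59°C
endpoint is itself used for a cycle, so with 10 cycles the sketch ends
at 60°C.

```python
def touchdown_annealing(start_c=69.0, step_c=1.0, cycles=10):
    """Annealing temperature (in degrees C) for each touchdown cycle.

    Assumes cycle 1 anneals at start_c and every later cycle is
    step_c lower than the one before it.
    """
    return [start_c - i * step_c for i in range(cycles)]


# Per-cycle annealing temperatures for the 10 touchdown cycles.
temps = touchdown_annealing()
```

The 35 cycles that follow use a fixed 58°C annealing step and are
therefore not part of this schedule.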

Determination of IS629 specific location and IS629 insertion sites

For the analysis of the IS629 insertion sites, primers were
designed to target the different IS629 flanking regions in
each strain and the plasmids. The presence/absence of
amplicons would determine the presence/absence of the
specific insertion sites, and the size of each amplicon
would indicate the presence/absence of IS629 at those loci.
Potential primers were analyzed for their ability to produce
stable base pairing with the template using the NetPrimer
software (PREMIER Biosoft International,
http://www.premierbiosoft.com/netprimer/netprlaunch/netprlaunch.html).
The sizes of the PCR products were between 1,500 and
2,500 bp in the case of IS629 presence in a strain, or
between 200 and 800 bp in the case that the specific
flanking region existed in the chromosome but did not
contain an IS629 element. Each multiplex PCR contained a set
of 16S rDNA primers as PCR internal control (either set
SRM86/SRM87 or VMP5 (5'-AGAAGCACCGGCTAACTC-3') and VMP6
(5'-CGCATTTCACCGCTACAC-3') [28]), and IS629 insertion site
specific primers. The list of the 40 primer combinations for
each IS629 site and PCR conditions can be found in
Additional file 5, Table S4.
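
The size-based interpretation of each flanking-region PCR can be
summarised as a small decision rule. The sketch below is illustrative
only: the function name and the returned labels are hypothetical, the
size windows are the ones given in the paragraph above, and sizes
falling outside both windows are reported as ambiguous rather than
forced into a category.

```python
def classify_amplicon(size_bp):
    """Interpret a flanking-region PCR result from the amplicon size.

    size_bp: estimated amplicon size in bp, or None if no band was seen.
    The windows (1,500-2,500 bp and 200-800 bp) come from the text;
    the labels are hypothetical and chosen for this example.
    """
    if size_bp is None:
        return "target site absent"
    if 1500 <= size_bp <= 2500:
        return "IS629 present"
    if 200 <= size_bp <= 800:
        return "target site present, IS629 absent"
    return "ambiguous"
```

Primer pairs with one primer inside IS629 give no product when the
element is absent, so for those pairs a missing band cannot separate
an empty target site from an absent one.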

IS629 presence/absence parsimony tree analysis

IS629 PCR fragment sizes indicating IS629 presence/absence
and IS629 target site presence/absence, identified by PCR
using primers specific for each IS629 observed in 4 E. coli
O157:H7 genomes, were entered as binary characters (+ or -)
into BioNumerics version 6.0 (Applied Maths,
Saint-Martens-Latem, Belgium). IS629 presence/absence and
IS629 target site presence/absence were used to create a
phylogenetic parsimony tree rooted to A5 CC strains for
A5/A6 CC strains analysis (Figure 1B), and statistical
support of the nodes was assessed by 1000 bootstrap
re-samplings. IS629 target site presence/absence was used to
create a phylogenetic parsimony tree rooted to A1/A2 CC
strains for strains of the entire model (A1 - A6)
(Figure 1C), and statistical support of the nodes was
assessed by 1000 bootstrap re-samplings.
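
To make the binary character coding concrete, the sketch below turns
"+"/"-" profiles into 0/1 vectors and counts the characters at which
two strains differ. It does not reproduce the BioNumerics parsimony
analysis; it only shows the kind of input matrix such an analysis
starts from. The function names, strain labels and profiles are
invented for illustration and are not data from this study.

```python
def encode_profile(profile):
    """Convert a '+'/'-' presence/absence string into a list of 1/0."""
    mapping = {"+": 1, "-": 0}
    try:
        return [mapping[state] for state in profile]
    except KeyError as exc:
        raise ValueError(f"unexpected character state: {exc}") from None


def character_differences(a, b):
    """Number of sites at which two '+'/'-' profiles differ."""
    x, y = encode_profile(a), encode_profile(b)
    if len(x) != len(y):
        raise ValueError("profiles must score the same set of sites")
    return sum(i != j for i, j in zip(x, y))


# Toy profiles, invented for illustration (not data from this study).
profiles = {"strain_X": "++-+", "strain_Y": "+--+", "strain_Z": "----"}
matrix = {name: encode_profile(p) for name, p in profiles.items()}
```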

IS629 phylogenetic analysis

A minimum evolution (ME) tree for IS629 sequences present in
4 E. coli O157:H7 genomes, two IS629 in the O55:H7 genome,
IS629 sequences from Shigella, two other IS629 isoforms
(IS1203 and IS3411), and ISPsy21 from Pseudomonas syringae
pv. savastanoi TK2009-5 (a member of the IS3 family sharing
only 68% homology with IS629) as out-group was constructed
using MEGA version 4.0 [29]. The evolutionary distances were
computed using the Kimura 2-parameter method [30] and are in
the units of the number of base substitutions per site. All
positions containing gaps and missing data were eliminated
from the dataset (Complete deletion option). There were a
total of 299 positions in the final dataset. The statistical
support of the nodes in the ME tree was assessed by 1000
bootstrap re-samplings.
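
For reference, the Kimura 2-parameter distance between two aligned
sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are
the proportions of transitional and transversional differences. The
sketch below computes it for a single pair; it is not the MEGA
implementation, and the function name is hypothetical. Positions with
a gap or ambiguous base in either sequence are skipped, which matches
complete deletion only for a single pair (in a multiple alignment,
MEGA would drop such a column from all sequences).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable positions")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```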

Additional material 
Additional file 1: "Figure S1". Schematic representation of the strategy
used for primer design. Primer pairs: A: presence/absence of IS629 at
specific loci, B: IS629 internal primer. A) Amplification product for
locations where the IS629 element is present; B) Amplification product
for locations where the IS629 element is absent, although the up- and
downstream flanking region is present in the genome but not carrying
an insertion.

Additional file 2: "Table S1". Genomes and plasmids investigated by
"in silico" analysis.

Additional file 3: "Table S2". IS629 insertion sites in O157:H7 strains
with complete genomes available in GenBank (Additional Table 1). In
bold are the locations shared by the four O157:H7 strains. The direct
repeats (duplications) are in red. IS629 sites were numbered from 1 - 47
starting with all sites in Sakai, followed by all additional, unshared sites
from EDL933, EC4115, the sites found in the plasmids and unshared sites
of strain TW1435. The newly found IS629 insertion in O rough:H7 strain
MA6 was numbered IS.39.

Additional file 4: "Table S3". IS629 target site presence/absence in CC
strains from the O157:H7 stepwise evolutionary model.

Additional file 5: "Table S4". Primer sequences for the amplification of
each IS629 flanking region on the four E. coli genomes available (see
Additional Table 2). An "IS absent" size of 0 bp means that the primer
pair was designed with one target region inside IS629; therefore the
IS629 target site could not be observed.